Abstract

We study a simple model of stochastic information filtering in a
randomly organized information system. For the simplest versions of
the model, the filtering dynamics can be described in terms of master
equations. Exact analytical results for these equations and results of
a numerical investigation of the dynamical features of the filter are
presented.

1 Introduction 

The main task of many modern information technologies is to increase
the speed of information processing. An important theoretical problem
in this area is to find the basic principles of efficient organization
of information flows. One possible approach to its solution is
developed in the framework of mathematical modeling of universal
self-organization mechanisms in complex dynamical systems.

In such investigations it is often useful to have a physical picture
of the phenomena under consideration. Usually one can imagine the
information system as a net of connected information sources. Its
physical role is to realize the interaction between suppliers and
users of information. Although this interaction determines the laws of
the system evolution, in many situations it can be represented
effectively as a specific set of elementary dynamical rules for the
sources in the model of the information net.

Recently a model of this type was proposed by A. Capocci, F. Slanina
and Y.-C. Zhang [1]. It describes ranking and filtration of
information, which are widespread procedures of information
processing. The model investigated in [1] is a one-dimensional
stochastic dynamical system with nearest-neighbor interaction.

The structure of many real information systems is very complex, and
frequently it can be better represented by a random graph than by a
one-dimensional chain. Experience in studies of physical phenomena
shows that the interaction structure normally has an essential
influence on the system behavior. Therefore it is interesting to
investigate the random neighbor version of the model proposed in [1].
In this paper we present the numerical and analytical results obtained
for information filter dynamics of this kind.

2 Formulation of model 

We consider a version of the filter dynamics without a fixed
interaction topology. The simplest modification of the model [1] can
be formulated in the following way. There are n elements, each
characterized by a quality, which is a number from the interval [0,1].
The initial state of the system is chosen at random. The state at the
time point t + 1 is obtained as follows. At the time point t one
chooses two random elements, and at the time point t + 1 the qualities
of both elements are set equal to the same number Q(t + 1). We
consider two versions of the model: model A (MA) and model B (MB). In
the MA the elements can be chosen arbitrarily. In the MB the qualities
of the chosen elements must be different, and when no two elements
differ the state does not change. If the qualities of the elements
chosen at time t are q and p, then Q(t + 1) = q with probability
$\mu$ and Q(t + 1) = p with probability $\lambda$, where
$\lambda = p/(p+q)$ and $\mu = q/(p+q)$. If the qualities of the
chosen elements are the same, they do not change. Thus, in both models
the state of the system becomes stable once all of its elements are of
the same quality.

The dynamics of the proposed models can be investigated by exact
analytical methods. We demonstrate this for a special initial
condition. We suppose that in the initial state l elements are of
quality p and m = n - l elements are of quality q. This dynamical
situation arises in the general case before the system reaches the
stable state. The qualitative description of dynamics of this kind can
be obtained with the help of master equations.

1 The statement of this problem was formed in discussion with Y.-C.
Zhang of the principles of information filtering presented in [1].
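The update rule formulated above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for the numerical results: the function names `filter_step` and `run_until_stable` are hypothetical, and for the MB we assume that a pair with different qualities is drawn by simple rejection sampling, which the text does not specify.

```python
import random

def filter_step(qualities, rng, model="MA"):
    """Apply one update of the random-neighbour filter in place.

    Returns False only if model MB finds no two different qualities
    (the state is then already stable); otherwise returns True.
    """
    n = len(qualities)
    if model == "MB":
        if len(set(qualities)) < 2:
            return False
        # Assumption: redraw the pair until the two qualities differ.
        while True:
            i, j = rng.sample(range(n), 2)
            if qualities[i] != qualities[j]:
                break
    else:
        i, j = rng.sample(range(n), 2)
    p, q = qualities[i], qualities[j]
    if p == q:
        return True  # MA may pick equal qualities; nothing changes
    # Q(t+1) = p with probability lambda = p/(p+q), otherwise q.
    winner = p if rng.random() < p / (p + q) else q
    qualities[i] = qualities[j] = winner
    return True

def run_until_stable(qualities, rng, model="MA", max_steps=10**6):
    """Iterate until all qualities coincide; return the number of steps."""
    steps = 0
    while len(set(qualities)) > 1 and steps < max_steps:
        filter_step(qualities, rng, model)
        steps += 1
    return steps
```

For example, `run_until_stable([0.9] * 3 + [0.1] * 7, random.Random(1), "MB")` runs the two-quality situation considered in the text until one quality has taken over the whole system.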

3 Numerical results 

We have chosen the MA for numerical analysis because it is more
complicated for analytical studies. The numerical experiments were
performed with a distributed C program running on a Solaris cluster,
which calculated all combinations of model variations, initial
distributions, and numbers of elements. The obtained results were then
analyzed in Mathematica to detect patterns in the model behaviour.

The main characteristic of the MA dynamics appears to be scale
invariance with respect to the number of elements N, which holds
independently of the model variations and initial conditions. We
studied the ensemble average of the element quality

$$ A(t,N) = \langle Q(t,N) \rangle , $$

where Q(t, N) denotes the mean value of the element quality in the
system at the time point t and $\langle \ldots \rangle$ means
averaging over an ensemble of system evolutions. Our results show that
A(t, N) can be written in the following form:

$$ A(t,N) = A_{inv}(t/N) \tag{1} $$

if the average values A(0, N) given by the initial conditions are
independent of N. The dynamics of A(t, N) for the MA with two types of
initial states is shown in fig. 1. We considered systems with initial
states equidistributed on the interval [0,1] (A(0, N) = 0.5) and
initial states with 10% of the elements of quality 0.9 and 90% of the
elements of quality 0.1 (A(0, N) = 0.1 · 0.9 + 0.9 · 0.1 = 0.18). For
large systems with N > 100 we obtained a close correspondence between
the numerical results for A(t, N) and (1). For the equidistributed
initial condition the correspondence is shown in fig. 2. For this type
of initial condition we have observed

$$ A_{inv}(x) = 1 - (2 + x)^{-1}. \tag{2} $$
For small N there is a certain deviation of the experimental curve
A(t, N) from (1) (see fig. 3), which is caused by the strong effect of
random fluctuations in systems with a small number of elements. We
have also calculated the average deviation from the average value

$$ D(t,N) = \sqrt{\langle (Q(t,N) - A(t,N))^2 \rangle}. $$

The typical curve for D(t, N) is presented in fig. 4.

For the initial state equidistributed on the interval [0,1] we
investigated the averaged number M(t, N) of different elements in the
system. The obtained curve for M(t,N)/N seems to be very well
approximated by the function S(t,N) = N/(t + N) (see fig. 5).

We modified the dynamics of the MA by adding Darwin selection to the
system. Using the model of the natural selection mechanism proposed in
[1], it was implemented as follows: at each time step the state of the
system is changed by the MA rules, and after that the element with the
lowest quality in the system is replaced by one with an arbitrarily
chosen quality in the interval [0,1].
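The selection rule just described, together with the ensemble average A(t, N), can be illustrated by the following minimal sketch. It is not the distributed C program used for the results above; all names (`ma_step`, `darwin_step`, `uniform_init`, `ensemble_average`) are hypothetical, and we assume that the "arbitrarily chosen quality" is drawn uniformly from [0,1].

```python
import random

def ma_step(qualities, rng):
    """One MA update: two random elements take a common quality."""
    i, j = rng.sample(range(len(qualities)), 2)
    p, q = qualities[i], qualities[j]
    if p != q:
        winner = p if rng.random() < p / (p + q) else q
        qualities[i] = qualities[j] = winner

def darwin_step(qualities, rng):
    """MA update followed by replacement of the lowest-quality element."""
    ma_step(qualities, rng)
    worst = min(range(len(qualities)), key=qualities.__getitem__)
    # Assumption: the replacement quality is uniform on [0, 1].
    qualities[worst] = rng.random()

def uniform_init(n, rng):
    """Initial state equidistributed on [0, 1]."""
    return [rng.random() for _ in range(n)]

def ensemble_average(n, steps, runs, step_fn, init, seed=0):
    """Estimate A(t, N) for t = steps, N = n by averaging over runs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        qs = init(n, rng)
        for _ in range(steps):
            step_fn(qs, rng)
        total += sum(qs) / n
    return total / runs
```

Calling `ensemble_average(50, 200, 20, ma_step, uniform_init)` and the same with `darwin_step` gives two estimates that can be compared for identical initial conditions, in the spirit of the comparison discussed in the text.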

The influence of the selection mechanism on the system dynamics can be
analyzed by comparing the process in the usual MA and in the MA with
Darwin selection for the same initial conditions. This comparison is
presented in fig. 6 for a system with elements having one of two
quality values and different initial distributions of "bad" (quality
0.1) and "good" (quality 0.9) elements. We see that Darwin selection
speeds up the filtration process for A(0, N) < 0.5 and slows it down
for A(0, N) > 0.5.



4 Master equations 

We denote the state of the system having k elements of quality p and
n - k elements of quality q at the time point t as {k; t}. Let
$P_k(t)$ be the probability of the state {k; t}. If at the time point
t + 1 the system state is {k; t + 1}, then it was in one of the states
{k - 1; t}, {k; t} or {k + 1; t} at the previous moment t. The
probability $P_{2,0}(k)$ to choose two elements of quality p in the
state {k; t} is

$$ P_{2,0}(k) = \frac{k(k-1)}{n(n-1)}. $$

The probability $P_{0,2}(k)$ to choose two elements of quality q is

$$ P_{0,2}(k) = \frac{(n-k)(n-k-1)}{n(n-1)}. $$

The probability $P_{1,1}(k)$ to choose one element of quality p and
one of quality q is

$$ P_{1,1}(k) = \frac{2k(n-k)}{n(n-1)}. $$

For $k \neq 0$, $k \neq n$, the probabilities
$P(\{k;t\},\{k';t+1\})$ of the transition
$\{k;t\} \to \{k';t+1\}$ in the MA can be presented as

$$ P(\{k;t\},\{k;t+1\}) = P_{2,0}(k) + P_{0,2}(k), $$

$$ P(\{k+1;t\},\{k;t+1\}) = P_{1,1}(k+1)\,\mu, $$

$$ P(\{k-1;t\},\{k;t+1\}) = P_{1,1}(k-1)\,\lambda, $$

where $\lambda = p/(p+q)$, $\mu = q/(p+q)$. For k = 0 or k = n, the
state {k; t} of the system is stable, hence

$$ P(\{0;t\},\{0;t+1\}) = P(\{n;t\},\{n;t+1\}) = 1. $$

Thus, we can write the master equation for the probability $P_k(t)$
as follows:

$$ P_k(t+1) = \lambda P_{k-1}(t) P_{1,1}(k-1)(1-\delta_{k0})
 + \mu P_{k+1}(t) P_{1,1}(k+1)(1-\delta_{kn}) $$
$$ \qquad + P_k(t) P_{0,2}(k)\left[(1-\delta_{k,n-1})(1-\delta_{kn})\right]
 + P_k(t) P_{2,0}(k)\left[(1-\delta_{k1})(1-\delta_{k0})\right]. $$

Substituting the values of the probabilities $P_{2,0}(k)$,
$P_{0,2}(k)$, $P_{1,1}(k)$ we have

$$ P_k(t+1) = \frac{2(k-1)(n-k+1)\lambda}{n(n-1)} P_{k-1}(t)(1-\delta_{k0})
 + \frac{2(k+1)(n-k-1)\mu}{n(n-1)} P_{k+1}(t)(1-\delta_{kn})
 + \frac{(n-k)(n-k-1)+k(k-1)}{n(n-1)} P_k(t). \tag{3} $$
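The master equation (3) is a linear map on the probability vector, so it can be checked and iterated directly. The sketch below builds the transition matrix of the MA with exact rational arithmetic; the names `ma_transition_matrix` and `evolve` are hypothetical, and the matrix convention `W[k][j]` = probability of the transition {j; t} to {k; t + 1} is our own choice for illustration.

```python
from fractions import Fraction

def ma_transition_matrix(n, p, q):
    """Return W with W[k][j] = P({j; t} -> {k; t+1}) for model MA."""
    lam = Fraction(p) / (Fraction(p) + Fraction(q))
    mu = 1 - lam
    norm = n * (n - 1)
    W = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        mixed = Fraction(2 * j * (n - j), norm)  # P_{1,1}(j)
        W[j][j] = 1 - mixed                      # P_{2,0}(j) + P_{0,2}(j)
        if j < n:
            W[j + 1][j] = lam * mixed  # a q-element becomes a p-element
        if j > 0:
            W[j - 1][j] = mu * mixed   # a p-element becomes a q-element
    return W

def evolve(W, P):
    """One step of the master equation: P(t+1) = W P(t)."""
    size = len(P)
    return [sum(W[k][j] * P[j] for j in range(size)) for k in range(size)]
```

Each column of `W` sums to one, and the columns for k = 0 and k = n are unit vectors, which reflects the stability of these two states.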

By similar arguments one obtains the following master equations for
the MB:

$$ P_0(t+1) = \mu P_1(t) + P_0(t), $$

$$ P_k(t+1) = \lambda P_{k-1}(t)(1-\delta_{k,1}) + \mu P_{k+1}(t)(1-\delta_{k,n-1}),
 \quad \text{for } 0 < k < n, \tag{4} $$

$$ P_n(t+1) = \lambda P_{n-1}(t) + P_n(t). $$

The equations (3), (4) can be used as the basis of analytical
investigations of the MA and MB. We demonstrate how they allow one to
calculate exactly the important characteristics of the models'
dynamics.
5 Exact results for MA

The simplest problem for the MA is to find the stationary solution of
the master equation (3). In vector notation this equation reads:

$$ P(t) - P(t-1) = -\frac{2}{n(n-1)}\, A V P(t-1) \quad \text{for } t > 0, \tag{5} $$

where P(t) is the vector with components $\{P(t)\}_k = P_k(t)$,
k = 0, 1, ..., n, and A, V are the matrices:

$$ V_{ij} = \delta_{ij}\, i(n-i), \qquad
 A_{ij} = \delta_{ij} - \lambda\,\delta_{i,j+1} - \mu\,\delta_{i,j-1}. $$
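The step from (3) to the vector form (5) rests on one elementary identity for the diagonal coefficient. A short worked version, written for the interior states 0 < k < n (the boundary rows follow in the same way because $V_{00} = V_{nn} = 0$), is:

```latex
% Diagonal coefficient of (3) minus one:
\frac{(n-k)(n-k-1) + k(k-1)}{n(n-1)} - 1
  = \frac{2k^2 - 2nk}{n(n-1)}
  = -\frac{2\,k(n-k)}{n(n-1)}
  = -\frac{2\,V_{kk}}{n(n-1)} .

% Off-diagonal coefficients of (3) in terms of V:
\frac{2(k-1)(n-k+1)\lambda}{n(n-1)} = \frac{2\lambda\,V_{k-1,k-1}}{n(n-1)}, \qquad
\frac{2(k+1)(n-k-1)\mu}{n(n-1)} = \frac{2\mu\,V_{k+1,k+1}}{n(n-1)} .

% Hence
P_k(t+1) - P_k(t)
  = -\frac{2}{n(n-1)}
    \bigl[ V_{kk} P_k(t) - \lambda V_{k-1,k-1} P_{k-1}(t) - \mu V_{k+1,k+1} P_{k+1}(t) \bigr]
  = -\frac{2}{n(n-1)} \{ A V P(t) \}_k .
```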

The stationary solution P(t) = P of (5) satisfies the equation:

$$ A V P = 0. \tag{6} $$

Since $\det A \neq 0$, it follows from (6) that VP = 0. Hence, the
solution of (6) can be written as:

$$ \{P\}_i = \delta_{i0}\,\rho + \delta_{in}\,(1-\rho), \qquad 0 \le \rho \le 1. \tag{7} $$

Now, the problem is to find $\rho$ as a function of the initial
probability distribution $P_k(0) = \{P_0\}_k$. If we denote

$$ G = I - \frac{2}{n(n-1)}\, A V , $$

then in virtue of (5), the vector P is expressed through the vector
$P_0$ in the following way:

$$ P = G_{as} P_0, \quad \text{where } G_{as} = \lim_{t \to \infty} G^t. \tag{8} $$

It follows from (7), (8) that the matrix $G_{as}$ must be of the form

$$ \{G_{as}\}_{ik} = \delta_{i0}\, x_k + \delta_{in}\, y_k. $$

The matrix G has the property:

$$ \sum_{i=0}^{n} \{G\}_{ik} = 1 \quad \text{for } 0 \le k \le n, $$

and the same equality must be fulfilled for $G_{as}$ too, hence
$x_k + y_k = 1$. Therefore

$$ \{G_{as}\}_{ik} = \delta_{i0}\, x_k + \delta_{in}\,(1 - x_k). $$
The vectors $P^{(0)}$, $P^{(n)}$ with components
$P^{(0)}_k = \delta_{0k}$, $P^{(n)}_k = \delta_{nk}$ are eigenvectors
of the matrix G: $G P^{(0)} = P^{(0)}$, $G P^{(n)} = P^{(n)}$. Hence,
$G_{as} P^{(0)} = P^{(0)}$, $G_{as} P^{(n)} = P^{(n)}$, and

$$ x_0 = 1, \qquad x_n = 0. \tag{9} $$

Taking into account that

$$ G_{as} G = G_{as}, $$

we obtain:

$$ \sum_{i=0}^{n} x_i G_{ik} = x_k. $$

Thus, we have to solve the equation xAV = 0, which can be written in
components as

$$ x_j - \lambda x_{j+1} - \mu x_{j-1} = 0, \qquad 0 < j < n. \tag{10} $$

The equations (10) with the boundary condition (9) coincide with those
of the classical gambler's ruin problem² (see for example eq. (2.1),
(2.2) of chapter XIV in [2]). The solution of (9), (10) has the form

$$ x_k = \frac{\lambda^{n-k}\mu^{k} - \mu^{n}}{\lambda^{n} - \mu^{n}}
 = \frac{\omega^{k} - \omega^{n}}{1 - \omega^{n}}. $$

Here, we used the convenient notation $\omega = \mu/\lambda$.
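The solution for $x_k$ can be verified against the recurrence (10) and the boundary condition (9) with exact rational arithmetic. The following is a minimal check; the function names `absorption_weights` and `residuals` are hypothetical, and the sketch assumes p and q are different so that $\omega \neq 1$.

```python
from fractions import Fraction

def absorption_weights(n, p, q):
    """x_k = (w^k - w^n) / (1 - w^n) with w = mu/lambda = q/p, p != q."""
    w = Fraction(q) / Fraction(p)
    return [(w**k - w**n) / (1 - w**n) for k in range(n + 1)]

def residuals(x, p, q):
    """Left-hand side of (10) for every interior index 0 < j < n."""
    lam = Fraction(p) / (Fraction(p) + Fraction(q))
    mu = 1 - lam
    return [x[j] - lam * x[j + 1] - mu * x[j - 1]
            for j in range(1, len(x) - 1)]
```

With `x = absorption_weights(8, Fraction(3, 5), Fraction(1, 5))`, all entries of `residuals(x, Fraction(3, 5), Fraction(1, 5))` vanish identically, and `x[0]`, `x[8]` reproduce the boundary values of (9).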

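In virtue of (7), (8), the parameter $\rho$ defining the stationary
solution of the master equation can be expressed in terms of the
initial probability distribution $P_k(0)$ as follows:

$$ \rho = \sum_{k=0}^{n} P_k(0)\, x_k = \frac{P(\omega) - \omega^{n}}{1 - \omega^{n}}. \tag{11} $$

Here $P(\omega)$ denotes the generating function

$$ P(\omega) = \sum_{k=0}^{n} P_k(0)\, \omega^{k}. $$

For the homogeneous initial distribution $P_k(0) = 1/(n+1)$ we have:

$$ P(\omega) = \frac{1}{n+1}\,\frac{1-\omega^{n+1}}{1-\omega}, \qquad
 \rho = \frac{1 - (1+n)\,\omega^{n} + n\,\omega^{n+1}}{(n+1)(1-\omega)(1-\omega^{n})}. $$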



2 This was pointed out to us by A. Capocci.
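For large n we get

$$ \rho = 1 - \frac{\omega}{(\omega-1)(n+1)} + Q_1(n)\,\omega^{-n}, \quad \text{if } \omega > 1, $$

$$ \rho = \frac{1}{(n+1)(1-\omega)} + Q_2(n)\,\omega^{n}, \quad \text{if } \omega < 1, $$

$$ \rho = \frac{1 + e^{x}(x-1)}{x\,(e^{x}-1)} + \frac{Q_3(n)}{n}, \quad \text{if } \omega = 1 + \frac{x}{n}. $$

Here, the functions $Q_1(n)$, $Q_2(n)$, $Q_3(n)$ have finite limits
for $n \to \infty$.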
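The closed form of $\rho$ for the homogeneous initial distribution and its large-n behaviour can be cross-checked numerically. The sketch below is only an illustration under our own naming (`rho_by_sum`, `rho_closed_form`, `rho_leading_term` are hypothetical); the leading terms used in `rho_leading_term` are the ones obtained by expanding the closed form for $\omega < 1$ and $\omega > 1$.

```python
from fractions import Fraction

def rho_by_sum(n, w):
    """rho = sum_k P_k(0) x_k with P_k(0) = 1/(n+1), see (11)."""
    x = [(w**k - w**n) / (1 - w**n) for k in range(n + 1)]
    return sum(x) / (n + 1)

def rho_closed_form(n, w):
    """Closed form of rho for the homogeneous initial distribution."""
    num = 1 - (n + 1) * w**n + n * w**(n + 1)
    return num / ((n + 1) * (1 - w) * (1 - w**n))

def rho_leading_term(n, w):
    """Leading large-n behaviour of rho for w != 1."""
    if w < 1:
        return 1 / ((n + 1) * (1 - w))
    return 1 - w / ((w - 1) * (n + 1))
```

For rational `w`, the first two functions agree exactly, and the difference to the leading term decays geometrically in n.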



6 Exact solutions for MB 

Now, we consider the master equations for the MB. For a system with
elements of qualities p and q this model can be considered as a
reformulation of the classical gambler's ruin model [2]. Let us denote
by P(z,u) the generating function of the probability distribution
$P_k(t)$:

$$ P(z,u) = \sum_{k=0}^{n} \sum_{t=0}^{\infty} P_k(t)\, z^{k} u^{t}, $$

and use the notations

$$ A(u) = P(0,u) = \sum_{t=0}^{\infty} P_0(t)\, u^{t}, \qquad
 B(u) = \frac{1}{n!}\frac{\partial^{n}}{\partial z^{n}} P(z,u)\Big|_{z=0}
 = \sum_{t=0}^{\infty} P_n(t)\, u^{t}. \tag{12} $$



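Then the equations (4) can be rewritten for the generating function as

$$ \frac{P(z,u) - P(z,0)}{u} = \left(\lambda z + \frac{\mu}{z}\right) P(z,u)
 + \left(1 - \lambda z - \frac{\mu}{z}\right)\left[A(u) + z^{n} B(u)\right] $$

or in an equivalent form:

$$ P(z,u)\left[z - u(\lambda z^{2} + \mu)\right]
 = z P(z,0) + u\,(z - \lambda z^{2} - \mu)\left[A(u) + z^{n} B(u)\right]. \tag{13} $$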

If the functions A(u), B(u) are known, the solution of equation (13)
is the following:

$$ P(z,u) = \frac{z P(z,0) + u\,(z - \lambda z^{2} - \mu)\left[A(u) + z^{n} B(u)\right]}
 {z - u(\lambda z^{2} + \mu)}. $$
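Let us denote by $\alpha_1(u)$, $\alpha_2(u)$ the solutions of the
equation $z - u(\lambda z^{2} + \mu) = 0$:

$$ \alpha_1(u) = \frac{1 - \sqrt{1 - 4u^{2}\lambda\mu}}{2u\lambda}, \qquad
 \alpha_2(u) = \frac{1 + \sqrt{1 - 4u^{2}\lambda\mu}}{2u\lambda}; $$

then substituting $z = \alpha_1 = \alpha_1(u)$ and
$z = \alpha_2 = \alpha_2(u)$ in (13) we obtain two equations for
A = A(u), B = B(u):

$$ A + \alpha_1^{n} B = \frac{\alpha_1 P_1}{u\,(\lambda\alpha_1^{2} - \alpha_1 + \mu)}, \qquad
 A + \alpha_2^{n} B = \frac{\alpha_2 P_2}{u\,(\lambda\alpha_2^{2} - \alpha_2 + \mu)}. \tag{14} $$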

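Here we denoted $P_i = P(\alpha_i(u),0)$, i = 1, 2. The solution of
the equations (14) has the form:

$$ A(u) = \frac{\alpha_1^{n} P_2 - \alpha_2^{n} P_1}{(1-u)(\alpha_1^{n} - \alpha_2^{n})}, \qquad
 B(u) = \frac{P_1 - P_2}{(1-u)(\alpha_1^{n} - \alpha_2^{n})}. \tag{15} $$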
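The passage from (14) to (15) becomes transparent once the right-hand sides of (14) are simplified with the defining equation of $\alpha_i$. A short worked version of this step, using only quantities already introduced, is:

```latex
% \alpha_i solves z - u(\lambda z^2 + \mu) = 0, hence
\lambda\alpha_i^{2} + \mu = \frac{\alpha_i}{u}
\quad\Longrightarrow\quad
\lambda\alpha_i^{2} - \alpha_i + \mu = \frac{\alpha_i\,(1-u)}{u} .

% The right-hand sides of (14) therefore reduce to
A + \alpha_i^{n} B
  = \frac{\alpha_i P_i}{u\,(\lambda\alpha_i^{2} - \alpha_i + \mu)}
  = \frac{P_i}{1-u}, \qquad i = 1, 2 .

% Subtracting the two equations gives B, and back-substitution gives A:
B = \frac{P_1 - P_2}{(1-u)(\alpha_1^{n} - \alpha_2^{n})}, \qquad
A = \frac{P_1}{1-u} - \alpha_1^{n} B
  = \frac{\alpha_1^{n} P_2 - \alpha_2^{n} P_1}{(1-u)(\alpha_1^{n} - \alpha_2^{n})} .
```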

Thus, we have obtained

$$ P(z,u) = \frac{z P(z,0)}{z - u(\lambda z^{2} + \mu)}
 + \frac{u\,(z - \lambda z^{2} - \mu)\left[(z^{n} - \alpha_2^{n}) P_1 - (z^{n} - \alpha_1^{n}) P_2\right]}
 {(1-u)\left[z - u(\lambda z^{2} + \mu)\right](\alpha_1^{n} - \alpha_2^{n})}. \tag{16} $$

This is the representation, in terms of the generating function, of
the exact solution of the master equation (4) for the finite system
with n elements.

In the limit of an infinite system, $n \to \infty$, we obtain from
(13) a simpler equation:

$$ P(z,u)\left[z - u(\lambda z^{2} + \mu)\right]
 = z P(z,0) + u\,(z - \lambda z^{2} - \mu)\, A(u). \tag{17} $$

It follows from (17) that

$$ A(u) = \frac{P(\alpha(u),0)}{1-u}, $$

where $\alpha(u) = \alpha_1(u)$ is the solution of the equation
$z - u(\lambda z^{2} + \mu) = 0$ that is analytic at the point u = 0.
Thus, the solution of (17) has the form:

$$ P(z,u) = \frac{z\,(1-u)\,P(z,0) + u\,(z - \lambda z^{2} - \mu)\,P(\alpha(u),0)}
 {(1-u)\left[z - u(\lambda z^{2} + \mu)\right]}. \tag{18} $$

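The moments $M^{(s)}(t)$ of the considered distribution function
$P_k(t)$, defined as

$$ M^{(s)}(t) = \sum_{k=0}^{n} P_k(t)\, k^{s}, $$

can be found by differentiating the generating function P(z,u). In
particular,

$$ M^{(1)}(t) = \frac{1}{t!}\frac{\partial^{t}}{\partial u^{t}}\frac{\partial}{\partial z} P(z,u)\Big|_{z=1,\,u=0}, \qquad
 M^{(2)}(t) = \frac{1}{t!}\frac{\partial^{t}}{\partial u^{t}}\frac{\partial^{2}}{\partial z^{2}} P(z,u)\Big|_{z=1,\,u=0} + M^{(1)}(t). $$

Hence $M^{(1)}(t)$, $M^{(2)}(t)$ can be expressed in terms of the
power series coefficients of the functions

$$ m_1(u) = \frac{\partial}{\partial z} P(z,u)\Big|_{z=1}, \qquad
 m_2(u) = \frac{\partial^{2}}{\partial z^{2}} P(z,u)\Big|_{z=1}. $$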



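Since

$$ P(1,0) = 1, \qquad \frac{\partial}{\partial z} P(z,0)\Big|_{z=1} = M^{(1)}(0), \qquad
 \frac{\partial^{2}}{\partial z^{2}} P(z,0)\Big|_{z=1} = M^{(2)}(0) - M^{(1)}(0), $$

it follows from (18) that for the infinite system

$$ m_1(u) = \frac{M^{(1)}(0)}{1-u} + \frac{(2\lambda-1)\,u\,(1 - P(\alpha,0))}{(1-u)^{2}}, $$

$$ m_2(u) = \frac{M^{(2)}(0) - M^{(1)}(0)}{1-u}
 + \frac{2(2\lambda-1)\,u\,M^{(1)}(0)}{(1-u)^{2}}
 + \frac{2u\left(1 + 4\lambda^{2}u - \lambda(1+3u)\right)(1 - P(\alpha,0))}{(1-u)^{3}}, $$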



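and

$$ M^{(1)}(t) = M^{(1)}(0) + \frac{1-\lambda}{\lambda}\,\chi_1 + (2\lambda-1)\,[1-\chi_0]\,t
 + \frac{[4\lambda(1-\lambda)]^{t/2}}{t^{3/2}}\, O_1(t), $$

$$ M^{(2)}(t) = M^{(2)}(0) - \frac{1-\lambda}{\lambda}\,\chi_1 - \frac{(1-\lambda)^{2}}{\lambda^{2}}\,\chi_2
 + \left[ 2(2\lambda-1)\left( M^{(1)}(0) + \frac{1-\lambda}{\lambda}\,\chi_1 \right) + 4\lambda(1-\lambda)(1-\chi_0) \right] t $$
$$ \qquad + (1-2\lambda)^{2}(1-\chi_0)\,t^{2}
 + \frac{[4\lambda(1-\lambda)]^{t/2}}{t^{3/2}}\, O_2(t). $$

Here $O_i(t)$, i = 1, 2, are functions bounded in the region of
large t:

$$ |O_i(t)| < C \quad \text{for } t \gg 1, $$

where C is a constant, and

$$ \chi_0 = P(\omega,0), \qquad \chi_1 = \frac{d}{dx} P(x,0)\Big|_{x=\omega}, \qquad
 \chi_2 = \frac{d^{2}}{dx^{2}} P(x,0)\Big|_{x=\omega}. $$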

We call the filtering finished at the time point t if at this moment
the qualities of all the elements first become equal. Let us denote by
$P^{(p)}(t)$, $P^{(q)}(t)$ the probabilities that the filtering is
finished at the time point t with the quality of all the elements
being equal to p and q respectively. Since the state {0; t} contains
only elements of quality q and the state {n; t} only elements of
quality p, we have

$$ P_0(t+1) = P^{(q)}(t+1) + P_0(t), \qquad P_n(t+1) = P^{(p)}(t+1) + P_n(t). $$

These relations can be rewritten in terms of generating functions as

$$ V^{(q)}(u) = A(u)(1-u), \qquad V^{(p)}(u) = B(u)(1-u), $$

where A(u), B(u) are defined in (12) and $V^{(p)}(u)$
($V^{(q)}(u)$) is the generating function of the probabilities
$P^{(p)}(t)$ ($P^{(q)}(t)$):

$$ V^{(p)}(u) = \sum_{t=0}^{\infty} P^{(p)}(t)\, u^{t}, \qquad
 V^{(q)}(u) = \sum_{t=0}^{\infty} P^{(q)}(t)\, u^{t}. $$

In virtue of (15), it follows that

$$ V^{(q)}(u) = \frac{\alpha_1^{n} P_2 - \alpha_2^{n} P_1}{\alpha_1^{n} - \alpha_2^{n}}, \qquad
 V^{(p)}(u) = \frac{P_1 - P_2}{\alpha_1^{n} - \alpha_2^{n}}. \tag{19} $$



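Hence, for the generating function $\mathcal{P}(u)$ of the
probability $\mathcal{P}(t) = P^{(p)}(t) + P^{(q)}(t)$ that the
filtering is finished at the moment t (called the search time in [1])
we obtain:

$$ \mathcal{P}(u) = \frac{(1 - \alpha_2^{n})\,P_1 - (1 - \alpha_1^{n})\,P_2}{\alpha_1^{n} - \alpha_2^{n}}. \tag{20} $$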

For the mean value T of the filtering (search) time we obtain:

$$ T = \frac{d}{du}\mathcal{P}(u)\Big|_{u=1}
 = \frac{M^{(1)}(0)}{\mu-\lambda} - \frac{n\,(1 - P(\omega,0))}{(\mu-\lambda)(1-\omega^{n})}. \tag{21} $$

This result agrees with the known formula for the mean duration of
play in the gambler's ruin problem (see for example chapter XIV,
formula (3.5) in [2]). For the homogeneous initial probability
distribution $P_k(0) = 1/(n+1)$ we have:

$$ T = \frac{n\left[(1+\omega)(1-\omega^{n}) - n(1-\omega)(1+\omega^{n})\right]}
 {2(n+1)(\mu-\lambda)(1-\omega)(1-\omega^{n})}, \tag{22} $$

and for large n: $T = n/(2|\mu-\lambda|) + O(n)$, $|O(n)| < C$, where
C is a constant.
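The mean search time of the MB can also be obtained without generating functions, by solving the first-step recurrence $T_k = 1 + \lambda T_{k+1} + \mu T_{k-1}$ with $T_0 = T_n = 0$, which offers an independent check of (22). The sketch below does this with exact rational arithmetic; the names `mean_times` and `mean_time_homogeneous` are hypothetical, and the tridiagonal elimination is our own choice of method, not the derivation used in the text.

```python
from fractions import Fraction

def mean_times(n, lam):
    """Solve T_k = 1 + lam*T_{k+1} + mu*T_{k-1}, T_0 = T_n = 0."""
    mu = 1 - lam
    # Forward sweep: represent T_k = c[k] * T_{k+1} + d[k].
    c = [Fraction(0)] * n
    d = [Fraction(0)] * n
    for k in range(1, n):
        denom = 1 - mu * c[k - 1]
        c[k] = lam / denom
        d[k] = (1 + mu * d[k - 1]) / denom
    # Back substitution starting from T_n = 0.
    T = [Fraction(0)] * (n + 1)
    for k in range(n - 1, 0, -1):
        T[k] = c[k] * T[k + 1] + d[k]
    return T

def mean_time_homogeneous(n, lam):
    """Formula (22) for P_k(0) = 1/(n+1), with w = mu/lam != 1."""
    mu = 1 - lam
    w = mu / lam
    num = n * ((1 + w) * (1 - w**n) - n * (1 - w) * (1 + w**n))
    den = 2 * (n + 1) * (mu - lam) * (1 - w) * (1 - w**n)
    return num / den
```

Averaging `mean_times(n, lam)` over the homogeneous initial distribution reproduces `mean_time_homogeneous(n, lam)` exactly for rational `lam`, for instance `lam = Fraction(3, 4)` and `n = 7`.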

The stationary solution of the master equation for the MB can be found
from the residue of P(z,u) at the pole u = 1. For the generating
function $P_{st}(z) = \sum_{k=0}^{n} P_k z^{k}$ of the stationary
distribution we obtain from (16) the following result:

$$ P_{st}(z) = -\operatorname{res}_{u=1}\left[P(z,u)\right]
 = \frac{(z^{n} - 1)\,P(\omega,0) - z^{n} + \omega^{n}}{\omega^{n} - 1}. \tag{23} $$

Comparing (7), (11) and (23), we see that the stationary solutions of
the master equations for the MA and MB coincide.



7 Conclusion 

We studied simple processes of information filtering generated by
consecutive comparisons of two randomly chosen elementary information
units. For the simplest version of the MB the mean search time T is
given by (22). It follows directly from the dynamical rules that the
search time for the MA must be larger. The search time depends on the
initial distribution too. Nevertheless, the obtained numerical and
analytical results show that for large systems the search time T and
the number N of elements obey the relation T/N = C, where C is
independent of T and N and depends only on the initial distribution
and the qualities of the elements. For the MB with elements of quality
p, q and the homogeneous initial distribution $P_k(0) = 1/(n+1)$, the
quantity C is $C = (p+q)/(2|p-q|)$ and becomes large for small
$|p-q|$.

The characteristic of information filtering introduced in [1], called
the efficiency R, is defined as the rank of the selected value in the
starting configuration. The efficiency for the MA and MB with elements
of quality p and q can be estimated by using the stationary solution
of the master equations (7). If p > q, the quantity
$\mathcal{R} = (1-\rho)/\rho$ is the ratio of the probabilities to
select the elements with qualities p and q. It can be considered as a
measure of filtration efficiency. For the homogeneous initial
distribution and a large number N of system elements,
$\mathcal{R} = N(1-\omega)$.
The information filter investigated in [1] is characterized for large
systems by the search time $T \sim N^{2}$ and the efficiency
$R \sim \ln N$. Comparison of these results with ours shows that
information filtration respecting a one-dimensional organization of
information units appears to be slower and less effective than
filtration based on the random choice algorithm.
Fig. 1. Legend: two-value model (qualities 0.1 and 0.9, 90% small) for
N = 100, N = 1,000 and N = 10,000; random value model, equidistributed,
for N = 1,000 and N = 10,000.
Fig. 2. Legend: random value model, equidistributed, for N = 500;
N = 5,000, scaled (x = x/10); N = 100,000, scaled (x = x/200).